79 research outputs found

    Efficient Batch Query Answering Under Differential Privacy

    Differential privacy is a rigorous privacy condition achieved by randomizing query answers. This paper develops efficient algorithms for answering multiple queries under differential privacy with low error. We pursue this goal by advancing a recent approach called the matrix mechanism, which generalizes standard differentially private mechanisms. This new mechanism works by first answering a different set of queries (a strategy) and then inferring the answers to the desired workload of queries. Although a few strategies are known to work well on specific workloads, finding the strategy that minimizes error on an arbitrary workload is intractable. We prove a new lower bound on the optimal error of this mechanism, and we propose an efficient algorithm that approaches this bound for a wide range of workloads. Comment: 6 figures, 22 pages
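    As a rough illustration of the strategy-then-inference idea described in this abstract (a minimal sketch, not the paper's optimized algorithm), the code below answers a placeholder strategy matrix A under \epsilon-differential privacy with Laplace noise and then derives workload answers by least squares; the workload W, strategy A, and data vector x are all hypothetical examples.

```python
import numpy as np

def matrix_mechanism(W, A, x, eps, rng=np.random.default_rng(0)):
    """Illustrative matrix mechanism: answer the strategy queries A with
    Laplace noise, then infer answers to the workload W by least squares."""
    # L1 sensitivity of the strategy: maximum column norm of A, since one
    # record changes a single cell of the data vector x by at most 1.
    sensitivity = np.abs(A).sum(axis=0).max()
    # Noisy answers to the strategy queries.
    y = A @ x + rng.laplace(scale=sensitivity / eps, size=A.shape[0])
    # Least-squares estimate of the data vector from the noisy answers.
    x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
    # Derived answers to the workload queries (post-processing only).
    return W @ x_hat

# Example: all range queries over 4 cells, answered via the identity strategy.
n = 4
W = np.array([[1 if i <= k <= j else 0 for k in range(n)]
              for i in range(n) for j in range(i, n)])
A = np.eye(n)                    # strategy: one query per histogram cell
x = np.array([3., 5., 2., 7.])   # placeholder histogram counts
print(matrix_mechanism(W, A, x, eps=1.0))
```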

    An Adaptive Mechanism for Accurate Query Answering under Differential Privacy

    We propose a novel mechanism for answering sets of counting queries under differential privacy. Given a workload of counting queries, the mechanism automatically selects a different set of "strategy" queries to answer privately, using those answers to derive answers to the workload. The main algorithm proposed in this paper approximates the optimal strategy for any workload of linear counting queries. With no cost to the privacy guarantee, the mechanism improves significantly on prior approaches and achieves near-optimal error for many workloads when applied under (\epsilon, \delta)-differential privacy. The result is an adaptive mechanism which can help users achieve good utility without requiring that they reason carefully about the best formulation of their task. Comment: VLDB2012. arXiv admin note: substantial text overlap with arXiv:1103.136

    Optimal error of query sets under the differentially-private matrix mechanism

    A common goal of privacy research is to release synthetic data that satisfies a formal privacy guarantee and can be used by an analyst in place of the original data. To achieve reasonable accuracy, a synthetic data set must be tuned to support a specified set of queries accurately, sacrificing fidelity for other queries. This work considers methods for producing synthetic data under differential privacy and investigates what makes a set of queries "easy" or "hard" to answer. We consider answering sets of linear counting queries using the matrix mechanism, a recent differentially-private mechanism that can reduce error by adding complex correlated noise adapted to a specified workload. Our main result is a novel lower bound on the minimum total error required to simultaneously release answers to a set of workload queries. The bound reveals that the hardness of a query workload is related to the spectral properties of the workload when it is represented in matrix form. The bound is most informative for (\epsilon,\delta)-differential privacy but also applies to \epsilon-differential privacy. Comment: 35 pages; short version to appear in the 16th International Conference on Database Theory (ICDT), 2013
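    The spectral character of such a bound can be sketched as follows, under the assumption that the relevant quantity is, up to a privacy-dependent constant whose exact form in terms of \epsilon and \delta is given in the paper, the squared sum of the workload's singular values divided by its number of columns. The two workloads compared here are hypothetical examples.

```python
import numpy as np

def spectral_hardness(W):
    """Spectral quantity assumed to drive the error lower bound, up to a
    privacy-dependent constant: (sum of singular values)^2 / n for an
    m x n workload matrix W in matrix form."""
    singular_values = np.linalg.svd(W, compute_uv=False)
    n = W.shape[1]
    return singular_values.sum() ** 2 / n

# Compare two workloads over 8 cells: all range queries vs. the identity.
n = 8
ranges = np.array([[1 if i <= k <= j else 0 for k in range(n)]
                   for i in range(n) for j in range(i, n)])
identity = np.eye(n)
print("range workload:", spectral_hardness(ranges))   # larger => "harder"
print("identity      :", spectral_hardness(identity))
```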

    Boosting the Accuracy of Differentially-Private Histograms Through Consistency

    We show that it is possible to significantly improve the accuracy of a general class of histogram queries while satisfying differential privacy. Our approach carefully chooses a set of queries to evaluate, and then exploits consistency constraints that should hold over the noisy output. In a post-processing phase, we compute the consistent input most likely to have produced the noisy output. The final output is differentially-private and consistent, but in addition, it is often much more accurate. We show, both theoretically and experimentally, that these techniques can be used for estimating the degree sequence of a graph very precisely, and for computing a histogram that can support arbitrary range queries accurately. Comment: 15 pages, 7 figures, minor revisions to previous version
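    A minimal two-level version of the consistency idea (far simpler than the paper's efficient hierarchical algorithm, and with placeholder counts) is sketched below: the total and the individual counts are answered with Laplace noise, and a least-squares post-processing step finds a single histogram that best fits those noisy answers, so all released quantities are mutually consistent.

```python
import numpy as np

def consistent_counts(x, eps, rng=np.random.default_rng(0)):
    """Two-level illustration of constrained inference: noisily answer the
    total and each individual count, then post-process by least squares so
    the released counts and total are consistent."""
    n = len(x)
    H = np.vstack([np.ones((1, n)), np.eye(n)])  # total query + per-cell queries
    sensitivity = 2.0                            # one record affects its cell and the total
    y = H @ x + rng.laplace(scale=sensitivity / eps, size=H.shape[0])
    # Post-processing the noisy answers does not affect the privacy guarantee.
    x_hat, *_ = np.linalg.lstsq(H, y, rcond=None)
    return x_hat

x = np.array([10., 4., 0., 6.])   # placeholder histogram
x_hat = consistent_counts(x, eps=1.0)
print(x_hat, x_hat.sum())         # released counts and their (consistent) total
```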

    A Theory of Pricing Private Data

    Personal data has value both to its owner and to institutions that would like to analyze it. Privacy mechanisms protect the owner's data while releasing to analysts noisy versions of aggregate query results. But such strict protections of individuals' data have not yet found wide use in practice. Instead, Internet companies, for example, commonly provide free services in return for valuable sensitive information from users, which they exploit and sometimes sell to third parties. As awareness of the value of personal data has increased, so has the drive to compensate the end user for her private information. The idea of monetizing private data can improve over the narrower view of hiding private data, since it empowers individuals to control their data through financial means. In this paper we propose a theoretical framework for assigning prices to noisy query answers, as a function of their accuracy, and for dividing the price amongst data owners who deserve compensation for their loss of privacy. Our framework adopts and extends key principles from both differential privacy and query pricing in data markets. We identify essential properties of the price function and micro-payments, and characterize valid solutions. Comment: 25 pages, 2 figures. Best Paper Award, to appear in the 16th International Conference on Database Theory (ICDT), 2013
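    As a toy sketch of the kind of pricing rule the framework reasons about (illustrative only; the price function, the per-owner weights, and the way privacy loss is represented here are all hypothetical and are not the paper's definitions), the code below charges in inverse proportion to the variance the buyer requests and splits the payment among data owners in proportion to weights standing in for their privacy loss.

```python
from dataclasses import dataclass

@dataclass
class PricedAnswer:
    answer_variance: float   # variance of the noise the buyer pays for
    price: float             # total charge for an answer of that accuracy
    micropayments: dict      # owner id -> share of the price

def price_query(variance, owner_weights, unit_price=1.0):
    """Toy pricing rule: more accurate (lower-variance) answers cost more,
    and the price is divided among owners in proportion to per-owner
    weights that stand in for their loss of privacy."""
    price = unit_price / variance
    total_weight = sum(owner_weights.values())
    micropayments = {owner: price * w / total_weight
                     for owner, w in owner_weights.items()}
    return PricedAnswer(variance, price, micropayments)

# A cheap, noisy answer vs. a more expensive, accurate one.
weights = {"alice": 2.0, "bob": 1.0, "carol": 1.0}
print(price_query(variance=10.0, owner_weights=weights))
print(price_query(variance=0.5, owner_weights=weights))
```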

    Rule-Based Application Development using Webdamlog

    We present the WebdamLog system for managing distributed data on the Web in a peer-to-peer manner. We demonstrate the main features of the system through an application called Wepic for sharing pictures between attendees of the SIGMOD conference. Using Wepic, the attendees will be able to share, download, rate and annotate pictures in a highly decentralized manner. We show how WebdamLog handles heterogeneity of the devices and services used to share data in such a Web setting. We exhibit the simple rules that define the Wepic application and show how easily the Wepic application can be modified. Comment: SIGMOD - Special Interest Group on Management Of Data (2013)

    Introducing Access Control in Webdamlog

    We survey recent work on the specification of an access control mechanism in a collaborative environment. The work is presented in the context of the WebdamLog language, an extension of Datalog to a distributed setting. We discuss a fine-grained access control mechanism for intensional data based on provenance, as well as a control mechanism for delegation, i.e., for deploying rules at remote peers. Comment: Proceedings of the 14th International Symposium on Database Programming Languages (DBPL 2013), August 30, 2013, Riva del Garda, Trento, Italy